MSE loss is one of the most common loss functions in all of machine learning. For targets $y_i$ and predictions $\hat{y}_i$, the MSE loss is given as

$$\mathrm{MSE} = \frac{1}{n} \sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2$$

The constant $\frac{1}{n}$ makes it a proper mean, but it’s just a scaling constant: positive scaling does not change the minimizer. As such, it is sometimes altered to suit a given application, e.g., using $\frac{1}{2n}$ so that the factor of 2 cancels when differentiating to derive a closed-form solution for linear regression.
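
As a concrete illustration, here is a minimal NumPy sketch of the loss; the array values are made up for the example.

```python
import numpy as np

def mse(y_true: np.ndarray, y_pred: np.ndarray) -> float:
    """Mean squared error: (1/n) * sum of squared residuals."""
    residuals = y_true - y_pred
    return float(np.mean(residuals ** 2))

y = np.array([3.0, -0.5, 2.0, 7.0])     # targets (illustrative values)
y_hat = np.array([2.5, 0.0, 2.0, 8.0])  # predictions
print(mse(y, y_hat))                    # 0.375
```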

MSE loss was used in Rumelhart, Hinton, and Williams (1986), the “original” backpropagation paper, and it remains a top choice for regression problems in deep learning. (Notice that minimizing the sum of squares over a linear predictor is exactly ordinary least-squares linear regression.)
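
To make that parenthetical concrete, here is a short NumPy sketch on synthetic data: the closed-form minimizer of the sum of squares (the normal equations) matches what a standard least-squares solver returns.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))                # synthetic design matrix
w_true = np.array([1.5, -2.0, 0.5])
y = X @ w_true + 0.1 * rng.normal(size=100)  # noisy linear targets

# Closed-form minimizer of the sum of squares (normal equations):
w_normal = np.linalg.solve(X.T @ X, X.T @ y)

# A standard least-squares solver minimizes the same objective:
w_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)

print(np.allclose(w_normal, w_lstsq))        # True
```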

It is not as common for classification tasks, where we are estimating an unknown probability mass function based on a (potentially inadequate) sample. In these cases, it usually helps to incorporate an explicit measure of the information captured, so cross-entropy loss is generally preferred.
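
A small sketch of why, with invented probability vectors: for a confidently wrong prediction, MSE stays bounded while cross-entropy grows without bound as the probability assigned to the true class goes to zero.

```python
import numpy as np

def mse(p: np.ndarray, q: np.ndarray) -> float:
    return float(np.mean((p - q) ** 2))

def cross_entropy(p: np.ndarray, q: np.ndarray, eps: float = 1e-12) -> float:
    # H(p, q) = -sum_i p_i * log(q_i); eps guards against log(0)
    return float(-np.sum(p * np.log(q + eps)))

target = np.array([1.0, 0.0, 0.0])              # one-hot true class
confident_wrong = np.array([0.01, 0.98, 0.01])  # confidently wrong prediction

print(mse(target, confident_wrong))             # ~0.647, bounded
print(cross_entropy(target, confident_wrong))   # ~4.61, diverges as q -> 0
```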